SingleCellAlleleExperiment 0.0.0.9000
Immune molecules such as B and T cell receptors, human leukocyte antigens (HLAs) or killer Ig-like receptors (KIRs) are encoded in the genetically most diverse loci of the human genome. Many of these immune genes are hyperpolymorphic, showing high allelic diversity across human populations. In addition, typical immune molecules are polygenic, which means that multiple functionally similar genes encode the same protein subunit.
We have developed a workflow, that allows quantification and interactive exploration of
donor-specific alleles of different immune genes. The workflow is composed of three steps:
1. Typing of donor-specific alleles,
2. Quantification of these alleles in single-cell transcriptomic data,
3. exploration of single-cell transcriptomic data using SingleCellAlleleExperiment (SCAE) class (Fig. 1).
The aim of the here presented SingleCellAlleleExperiment (SCAE)
class is to provide an addition to the SingleCellExperiment object that
can simultaneously contain quantification of alleles, genes, and groups of functionally similar genes
and thus allows data analysis across these immunologically relevent different layers of annotation.
SingleCellAlleleExperiment (SCAE) classThe SingleCellAlleleExperiment (SCAE) class is a container for storing and handling allele-aware quantification data for immune genes. The SCAE class is derived
from the SingleCellExperiment(SCE) class and uses the same overall object architecture. However, multiple data layers are integrated into the object during object generation.
Data from a novel allele-aware quantification method contains information about alleles for immune genes. During object generation, the allele information is aggregated into two additional data layers
via an ontology based MHC design principle and appended to the initial raw data using a lookup table. Thus the final SCAE object contains non-immune genes, alleles for immune genes, an immune gene layer and a functional class layer (refer to Figure 1).
For example, the counts of the alleles A*01:01:01:01 and A*02:01:01:01 that are present in the raw input data will be summed up and transformed into the HLA-A immune gene layer. Next, all counts for the present HLA-class I immune genes will be summed up and transformed into the HLA-class I functional class layer. The information necessary to perform these transformations is saved in a lookup table retrieved from the IPD-IMGT/HLA database.
The implemented object follows similar conventions like the SCE class, where rows should represent features (genes, transcripts) and columns should represent cells. Established single cell packages like scater and scran can be used with the SCAE object to perform downstream analysis on immune gene expression.
This allows new insights on functional as well as allele level to uncover the high diversity of immune genes.
Figure 1: Scheme of SingleCellAlleleExperiment object structure with lookup table.
The read in function of the SCAE package readAlleleCounts expects specific files and file identifiers.
The stated input directory should contain the following files:
The used dataset for later downstream analysis and testing of the functionalities of the multi-layer object is a dual-center, two-cohort study where whole blood and peripheral blood mononuclear cells underwent scRNA-sequencing. The whole transcriptome dataset is under controlled access and not for public use. The corresponding publication Severe COVID-19 Is Marked by a Dysregulated Myeloid Cell Compartment can be found here.
The following packages are abundant for performing the downstream analysis and visualization.
library(SingleCellAlleleExperiment)
library(scran)
library(scater)
library(gridExtra)
library(tidyverse)
library(patchwork)
library(cowplot)
library(ggplot2)
library(viridis)
First the directory path is stated. The directory should contain all expected files.
dir_path <- "../inst/extdata"
list.files("../inst/extdata")
## [1] "cells_x_genes.barcodes.txt" "cells_x_genes.genes.txt"
## [3] "cells_x_genes.mtx" "intro.html"
## [5] "lookup_table_HLA_only.csv" "scae_advanced.png"
SCAE objects implements the same object architecture as SCE objects. When reading allele counts, the method by default filters cells from a knee-plot inflection point. Alternatively, manual filter settings can be provided.
scae <- readAlleleCounts(dir_path,
sample.names = "EGA_WTA",
filter = TRUE,
exp_type = "WTA",
symbols = "biomart")
## Warning: Removed 2555 rows containing missing values (`geom_line()`).
## Filtering performed based on the inflection point at: 105 UMI counts.[1] "Runtime check (1/2) Read_in: 3.76858973503113"
## [1] " Generating SCAE (1/5) extending rowData: 4.0276985168457"
## [1] " Generating SCAE (2/5) filtering and normalization: 0.13380765914917"
## [1] " Generating SCAE (3/5) alleles2genes: 1.39652752876282"
## [1] " Generating SCAE (4/5) genes2functional: 1.38803172111511"
## [1] " Generating SCAE (5/5) log_transform: 1.38803172111511"
## [1] "Runtime check (2/2) Generating SCAE completed: 7.30348300933838"
## [1] "Total runtime, completed read_in, filtering and normalization and generating scae object 11.0722854137421"
scae
## class: SingleCellAlleleExperiment
## dim: 62718 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(62718): ENSG00000160072.20 ENSG00000279928.2 ... HLA_class_I
## HLA_class_II
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
Two new classification columns are introduced in the rowData slot. Namely the NI_I column and Quant_type column. Both columns are used to identify each row of the object to its corresponding data layer (see figure 1).
rowData(scae)
## DataFrame with 62718 rows and 4 columns
## Ensembl.ID Symbol NI_I Quant_type
## <character> <character> <character> <character>
## ENSG00000160072.20 ENSG00000160072.20 ATAD3B NI G
## ENSG00000279928.2 ENSG00000279928.2 DDX11L17 NI G
## ENSG00000228037.1 ENSG00000228037.1 NA NI G
## ENSG00000142611.17 ENSG00000142611.17 PRDM16 NI G
## ENSG00000284616.1 ENSG00000284616.1 NA NI G
## ... ... ... ... ...
## HLA-DQA1 HLA-DQA1 HLA-DQA1 I G
## HLA-DQB1 HLA-DQB1 HLA-DQB1 I G
## HLA-DPB1 HLA-DPB1 HLA-DPB1 I G
## HLA_class_I HLA_class_I HLA_class_I I F
## HLA_class_II HLA_class_II HLA_class_II I F
As the object extends the count matrix during the object generation, its abundant to compute scaling factors on the raw data prior to extending and integrating the data layers. The scaling factors are used for scaling normalization in a later step of the SCAE constructor.
colData(scae)
## DataFrame with 7501 rows and 3 columns
## Sample Barcode sizeFactor
## <character> <character> <numeric>
## ATACTTAGGACAGTGGTAGGCTACAGA EGA_WTA ATACTTAGGACAGTGGTAGG.. 3.491050
## CAGAATCGTGAGTGCGAAGGTGAGTTA EGA_WTA CAGAATCGTGAGTGCGAAGG.. 0.493435
## AGCCATCACACCTGAGTCCAATACAAG EGA_WTA AGCCATCACACCTGAGTCCA.. 2.328737
## CTTCACATAGTGCAGTAATACTTCGGA EGA_WTA CTTCACATAGTGCAGTAATA.. 2.316401
## ATGTAATGGCCGTATCTAACTGAATTC EGA_WTA ATGTAATGGCCGTATCTAAC.. 0.190521
## ... ... ... ...
## GCAATCCGATGTGAAGAAAGCTATGTG EGA_WTA GCAATCCGATGTGAAGAAAG.. 0.581156
## GAGCCAATAAGCGACACCAACGTGTGA EGA_WTA GAGCCAATAAGCGACACCAA.. 1.894241
## CACACACTAATCCTAGGACAAACGTGG EGA_WTA CACACACTAATCCTAGGACA.. 2.538447
## ACACACAAATTGCGTACATTCAGCTCA EGA_WTA ACACACAAATTGCGTACATT.. 0.150772
## ATCAGAGCTGTACATCTACACGATCCG EGA_WTA ATCAGAGCTGTACATCTACA.. 0.149401
Additionally to the established getters from the SCE package, new getters are implemented to retrieve the different data layers integrated in the SCAE object.
get_nigenes(scae)
## class: SingleCellAlleleExperiment
## dim: 62695 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(62695): ENSG00000160072.20 ENSG00000279928.2 ...
## ENSG00000277475.1 ENSG00000275405.1
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
head(rownames(get_nigenes(scae)))
## [1] "ENSG00000160072.20" "ENSG00000279928.2" "ENSG00000228037.1"
## [4] "ENSG00000142611.17" "ENSG00000284616.1" "ENSG00000157911.11"
get_alleles(scae)
## class: SingleCellAlleleExperiment
## dim: 14 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(14): A*02:01:01:01 A*26:01:01:01 ... DPB1*10:01:01:01
## DPB1*13:01:01:01
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
head(rownames(get_alleles(scae)))
## [1] "A*02:01:01:01" "A*26:01:01:01" "B*51:01:01:01" "B*57:01:01:01"
## [5] "C*04:01:01:01" "C*06:02:01:01"
get_agenes(scae)
## class: SingleCellAlleleExperiment
## dim: 7 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(7): HLA-A HLA-B ... HLA-DQB1 HLA-DPB1
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
head(rownames(get_agenes(scae)))
## [1] "HLA-A" "HLA-B" "HLA-C" "HLA-DRB1" "HLA-DQA1" "HLA-DQB1"
get_func(scae)
## class: SingleCellAlleleExperiment
## dim: 2 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(2): HLA_class_I HLA_class_II
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
head(rownames(get_func(scae)))
## [1] "HLA_class_I" "HLA_class_II"
In case the raw input data contains allele information that is not present in the lookup table, these alleles will be classified accordingly in the Quant_type classification column in the colData slot and can be retrieved with the get_unknown() getter. These alleles have potentially invalid identifiers and will not be considered in further generated data-layers.
get_unknown(scae)
## class: SingleCellAlleleExperiment
## dim: 0 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(0):
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
head(rownames(get_unknown(scae)))
## character(0)
Checking the expression for the allele-layer, immune gene layer and functional class layer. Allele identifiers are in the form of A*02:01:01:01. The immune genes are in the form of HLA-A and the functional classes HLA_class_I. Here we see that HLA-class I and HLA-C are the most abundant functional class and immune gene respectively, given the underlying dataset.
all_alleles <- c(rownames(get_alleles(scae)),
rownames(get_agenes(scae)),
rownames(get_func(scae)))
plotExpression(scae, all_alleles)
In the following sections, main steps for dimensional reduction are performed, offering insights into the different data layers of the SCAE object as well giving an idea on how to perform immune gene expression analysis.
The non-imune genes are combined with each of the integrated immune gene allele-aware layers to determine three different subsets.
redim_scae <- scae
ni_a <- redim_scae[c(rownames(get_nigenes(redim_scae)), rownames(get_alleles(redim_scae))), ]
ni_a
## class: SingleCellAlleleExperiment
## dim: 62709 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(62709): ENSG00000160072.20 ENSG00000279928.2 ...
## DPB1*10:01:01:01 DPB1*13:01:01:01
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
ni_g <- redim_scae[c(rownames(get_nigenes(redim_scae)), rownames(get_agenes(redim_scae))), ]
ni_g
## class: SingleCellAlleleExperiment
## dim: 62702 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(62702): ENSG00000160072.20 ENSG00000279928.2 ... HLA-DQB1
## HLA-DPB1
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
ni_f <- redim_scae[c(rownames(get_nigenes(redim_scae)), rownames(get_func(redim_scae))), ]
ni_f
## class: SingleCellAlleleExperiment
## dim: 62697 7501
## metadata(0):
## assays(2): counts logcounts
## rownames(62697): ENSG00000160072.20 ENSG00000279928.2 ... HLA_class_I
## HLA_class_II
## rowData names(4): Ensembl.ID Symbol NI_I Quant_type
## colnames(7501): ATACTTAGGACAGTGGTAGGCTACAGA CAGAATCGTGAGTGCGAAGGTGAGTTA
## ... ACACACAAATTGCGTACATTCAGCTCA ATCAGAGCTGTACATCTACACGATCCG
## colData names(3): Sample Barcode sizeFactor
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
Using the modelGeneVar() function prior to getTopHVGs. Both functions are part from the scran package.
df_ni_a <- modelGeneVar(ni_a)
head(df_ni_a)
## DataFrame with 6 rows and 6 columns
## mean total tech bio p.value
## <numeric> <numeric> <numeric> <numeric> <numeric>
## ENSG00000160072.20 3.14134e-03 4.40929e-03 3.55075e-03 8.58536e-04 0.00224164
## ENSG00000279928.2 8.13804e-05 4.96774e-05 9.19867e-05 -4.23093e-05 0.99999997
## ENSG00000228037.1 1.04811e-04 8.24006e-05 1.18471e-04 -3.60701e-05 0.99982732
## ENSG00000142611.17 2.89372e-04 2.24889e-04 3.27086e-04 -1.02197e-04 0.99987990
## ENSG00000284616.1 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 NaN
## ENSG00000157911.11 1.63798e-03 1.85760e-03 1.85146e-03 6.14107e-06 0.48445060
## FDR
## <numeric>
## ENSG00000160072.20 0.0191929
## ENSG00000279928.2 1.0000000
## ENSG00000228037.1 1.0000000
## ENSG00000142611.17 1.0000000
## ENSG00000284616.1 NaN
## ENSG00000157911.11 1.0000000
df_ni_g <- modelGeneVar(ni_g)
head(df_ni_g)
## DataFrame with 6 rows and 6 columns
## mean total tech bio p.value
## <numeric> <numeric> <numeric> <numeric> <numeric>
## ENSG00000160072.20 3.14134e-03 4.40929e-03 3.56159e-03 8.47700e-04 0.00210309
## ENSG00000279928.2 8.13804e-05 4.96774e-05 9.22675e-05 -4.25901e-05 0.99999999
## ENSG00000228037.1 1.04811e-04 8.24006e-05 1.18832e-04 -3.64317e-05 0.99988648
## ENSG00000142611.17 2.89372e-04 2.24889e-04 3.28084e-04 -1.03195e-04 0.99992239
## ENSG00000284616.1 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 NaN
## ENSG00000157911.11 1.63798e-03 1.85760e-03 1.85711e-03 4.90883e-07 0.49873188
## FDR
## <numeric>
## ENSG00000160072.20 0.0180077
## ENSG00000279928.2 1.0000000
## ENSG00000228037.1 1.0000000
## ENSG00000142611.17 1.0000000
## ENSG00000284616.1 NaN
## ENSG00000157911.11 1.0000000
df_ni_f <- modelGeneVar(ni_f)
head(df_ni_f)
## DataFrame with 6 rows and 6 columns
## mean total tech bio p.value
## <numeric> <numeric> <numeric> <numeric> <numeric>
## ENSG00000160072.20 3.14134e-03 4.40929e-03 3.57652e-03 8.32770e-04 0.00233094
## ENSG00000279928.2 8.13804e-05 4.96774e-05 9.26542e-05 -4.29769e-05 0.99999999
## ENSG00000228037.1 1.04811e-04 8.24006e-05 1.19330e-04 -3.69298e-05 0.99991529
## ENSG00000142611.17 2.89372e-04 2.24889e-04 3.29459e-04 -1.04570e-04 0.99994261
## ENSG00000284616.1 0.00000e+00 0.00000e+00 0.00000e+00 0.00000e+00 NaN
## ENSG00000157911.11 1.63798e-03 1.85760e-03 1.86490e-03 -7.29427e-06 0.51895488
## FDR
## <numeric>
## ENSG00000160072.20 0.0199549
## ENSG00000279928.2 1.0000000
## ENSG00000228037.1 1.0000000
## ENSG00000142611.17 1.0000000
## ENSG00000284616.1 NaN
## ENSG00000157911.11 1.0000000
Plotting the variance for each data layer.
#plot variance
par(mfrow=c(1,3))
fit_ni_a <- metadata(df_ni_a)
plot(fit_ni_a$mean, fit_ni_a$var,
xlab = "mean of log-expression", ylab = "variance of log-expression", main = "NI-A")
points(fit_ni_a$mean, fit_ni_a$var, col = "red", pch = 16)
curve(fit_ni_a$trend(x), col = "dodgerblue", add = TRUE, lwd = 2)
fit_ni_g <- metadata(df_ni_g)
plot(fit_ni_g$mean, fit_ni_g$var,
xlab = "mean of log-expression", ylab = "variance of log-expression", main = "NI-G")
points(fit_ni_g$mean, fit_ni_g$var, col = "red", pch = 16)
curve(fit_ni_g$trend(x), col = "dodgerblue", add = TRUE, lwd = 2)
fit_ni_f <- metadata(df_ni_f)
plot(fit_ni_f$mean, fit_ni_f$var,
xlab = "mean of log-expression", ylab = "variance of log-expression", main = "NI-F")
points(fit_ni_f$mean, fit_ni_f$var, col = "red", pch = 16)
curve(fit_ni_f$trend(x), col = "dodgerblue", add = TRUE, lwd = 2)
par(mfrow=c(1,1))
Compute a list of HVGs for each data layer. Return the top 0.1 % HVGs per layer using getTopHVGs.
top_ni_a <- getTopHVGs(df_ni_a, prop = 0.1)
length(top_ni_a)
## [1] 1090
head(top_ni_a, n = 15)
## [1] "ENSG00000244734.4" "ENSG00000188536.13" "ENSG00000206172.8"
## [4] "ENSG00000143546.10" "ENSG00000115523.17" "ENSG00000197746.15"
## [7] "ENSG00000090382.7" "ENSG00000163220.11" "ENSG00000147454.14"
## [10] "ENSG00000019582.17" "C*06:02:01:01" "ENSG00000087086.15"
## [13] "ENSG00000162722.9" "ENSG00000158869.11" "C*04:01:01:01"
top_ni_g <- getTopHVGs(df_ni_g, prop = 0.1)
length(top_ni_g)
## [1] 1075
head(top_ni_g, n = 15)
## [1] "ENSG00000244734.4" "ENSG00000188536.13" "ENSG00000206172.8"
## [4] "ENSG00000143546.10" "ENSG00000115523.17" "ENSG00000197746.15"
## [7] "ENSG00000090382.7" "ENSG00000163220.11" "ENSG00000147454.14"
## [10] "ENSG00000019582.17" "ENSG00000087086.15" "ENSG00000162722.9"
## [13] "ENSG00000158869.11" "ENSG00000223609.11" "ENSG00000211592.8"
top_ni_f <- getTopHVGs(df_ni_f, prop = 0.1)
length(top_ni_f)
## [1] 1058
head(top_ni_f, n = 15)
## [1] "ENSG00000244734.4" "ENSG00000188536.13" "ENSG00000206172.8"
## [4] "ENSG00000143546.10" "ENSG00000115523.17" "ENSG00000197746.15"
## [7] "ENSG00000090382.7" "ENSG00000163220.11" "ENSG00000147454.14"
## [10] "ENSG00000019582.17" "ENSG00000087086.15" "ENSG00000162722.9"
## [13] "ENSG00000158869.11" "ENSG00000223609.11" "ENSG00000211592.8"
Compute PCA for each layer and store the results in the object. Its Important to make unique identifiers for each layer/run or the results will be overwritten and just saved as PCA. Here, the runPCA functions from the scater package is used.
set.seed(18)
redim_scae <- runPCA(redim_scae, ncomponents = 10, subset_row = top_ni_a, exprs_values = "logcounts", name = "PCA_a")
redim_scae <- runPCA(redim_scae, ncomponents = 10, subset_row = top_ni_g, exprs_values = "logcounts", name = "PCA_g")
redim_scae <- runPCA(redim_scae, ncomponents = 10, subset_row = top_ni_f, exprs_values = "logcounts", name = "PCA_f")
reducedDimNames(redim_scae)
## [1] "PCA_a" "PCA_g" "PCA_f"
The same goes for running t-SNE with the runTSNE function from the scater package. Unique identifiers are stated here for each layer as well.
set.seed(18)
redim_scae <- runTSNE(redim_scae, dimred= "PCA_a", perplexity = 50, name = "TSNE_a_50")
set.seed(18)
redim_scae <- runTSNE(redim_scae, dimred= "PCA_g", perplexity = 50, name = "TSNE_g_50")
set.seed(18)
redim_scae <- runTSNE(redim_scae, dimred= "PCA_f", perplexity = 50, name = "TSNE_f_50")
List of results from the performed reduced dimension analysis.
reducedDimNames(redim_scae)
## [1] "PCA_a" "PCA_g" "PCA_f" "TSNE_a_50" "TSNE_g_50" "TSNE_f_50"
Exemplary visualization for the t-SNE results on gene level for immune genes that relate to HLA-class I. In the given dataset, these are the immune genes HLA-A, HLA-B and HLA-C plotted alongside their alleles. This allows for insights into potential genetic differences shown on allele-level.
#determining which t-SNE (perplexity) to choose for all visualizations
which_tsne <- "TSNE_g_50"
#EGA_hla_a_alleles
tsne_g_a <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "HLA-A") + ggtitle("HLA-A gene")
tsne_g_a1 <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "A*02:01:01:01") + ggtitle("Allele A*02:01:01:01")
tsne_g_a2 <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "A*26:01:01:01") + ggtitle("Allele A*26:01:01:01")
p2 <- tsne_g_a + tsne_g_a1 + tsne_g_a2
p2
tsne_g_b <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "HLA-B") + ggtitle("HLA-B gene")
tsne_g_b1 <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "B*51:01:01:01") + ggtitle("Allele B*51:01:01:01")
tsne_g_b2 <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "B*57:01:01:01") + ggtitle("Allele B*57:01:01:01")
p3 <- tsne_g_b + tsne_g_b1 + tsne_g_b2
p3
#EGA_hla_c_alleles
tsne_g_g <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "HLA-C") + ggtitle("HLA-C gene")
tsne_g_c1 <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "C*04:01:01:01") + ggtitle("Allele C*04:01:01:01")
tsne_g_c2 <- plotReducedDim(redim_scae, dimred = which_tsne, colour_by = "C*06:02:01:01") + ggtitle("Allele C*06:02:01:01")
p1 <- tsne_g_g + tsne_g_c1 + tsne_g_c2
p1
sessionInfo()
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Europe/Berlin
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] viridis_0.6.4 viridisLite_0.4.2
## [3] cowplot_1.1.1 patchwork_1.1.3
## [5] lubridate_1.9.2 forcats_1.0.0
## [7] stringr_1.5.0 dplyr_1.1.2
## [9] purrr_1.0.2 readr_2.1.4
## [11] tidyr_1.3.0 tibble_3.2.1
## [13] tidyverse_2.0.0 gridExtra_2.3
## [15] scater_1.28.0 ggplot2_3.4.3
## [17] scran_1.28.2 scuttle_1.10.2
## [19] SingleCellAlleleExperiment_0.0.0.9000 SingleCellExperiment_1.22.0
## [21] SummarizedExperiment_1.30.2 Biobase_2.60.0
## [23] GenomicRanges_1.52.0 GenomeInfoDb_1.36.2
## [25] IRanges_2.34.1 S4Vectors_0.38.1
## [27] BiocGenerics_0.46.0 MatrixGenerics_1.12.3
## [29] matrixStats_1.0.0 BiocStyle_2.28.1
##
## loaded via a namespace (and not attached):
## [1] rstudioapi_0.15.0 jsonlite_1.8.7
## [3] magrittr_2.0.3 magick_2.7.5
## [5] ggbeeswarm_0.7.2 farver_2.1.1
## [7] rmarkdown_2.24 zlibbioc_1.46.0
## [9] vctrs_0.6.3 memoise_2.0.1
## [11] DelayedMatrixStats_1.22.6 RCurl_1.98-1.12
## [13] htmltools_0.5.6 S4Arrays_1.0.5
## [15] progress_1.2.2 curl_5.0.2
## [17] BiocNeighbors_1.18.0 Rhdf5lib_1.22.0
## [19] rhdf5_2.44.0 sass_0.4.7
## [21] bslib_0.5.1 cachem_1.0.8
## [23] igraph_1.5.1 lifecycle_1.0.3
## [25] pkgconfig_2.0.3 rsvd_1.0.5
## [27] Matrix_1.6-0 R6_2.5.1
## [29] fastmap_1.1.1 GenomeInfoDbData_1.2.10
## [31] digest_0.6.33 colorspace_2.1-0
## [33] AnnotationDbi_1.62.2 dqrng_0.3.0
## [35] irlba_2.3.5.1 RSQLite_2.3.1
## [37] org.Hs.eg.db_3.17.0 beachmat_2.16.0
## [39] labeling_0.4.3 filelock_1.0.2
## [41] fansi_1.0.4 timechange_0.2.0
## [43] httr_1.4.7 abind_1.4-5
## [45] compiler_4.3.1 bit64_4.0.5
## [47] withr_2.5.0 BiocParallel_1.34.2
## [49] DBI_1.1.3 highr_0.10
## [51] R.utils_2.12.2 HDF5Array_1.28.1
## [53] biomaRt_2.56.1 rappdirs_0.3.3
## [55] DelayedArray_0.26.7 bluster_1.10.0
## [57] tools_4.3.1 vipor_0.4.5
## [59] beeswarm_0.4.0 R.oo_1.25.0
## [61] glue_1.6.2 rhdf5filters_1.12.1
## [63] grid_4.3.1 Rtsne_0.16
## [65] cluster_2.1.4 generics_0.1.3
## [67] gtable_0.3.4 tzdb_0.4.0
## [69] R.methodsS3_1.8.2 hms_1.1.3
## [71] BiocSingular_1.16.0 ScaledMatrix_1.8.1
## [73] metapod_1.8.0 xml2_1.3.5
## [75] utf8_1.2.3 XVector_0.40.0
## [77] ggrepel_0.9.3 pillar_1.9.0
## [79] limma_3.56.2 BiocFileCache_2.8.0
## [81] lattice_0.21-8 bit_4.0.5
## [83] tidyselect_1.2.0 locfit_1.5-9.8
## [85] Biostrings_2.68.1 knitr_1.43
## [87] bookdown_0.35 edgeR_3.42.4
## [89] xfun_0.40 statmod_1.5.0
## [91] DropletUtils_1.20.0 stringi_1.7.12
## [93] yaml_2.3.7 evaluate_0.21
## [95] codetools_0.2-19 BiocManager_1.30.22
## [97] cli_3.6.1 munsell_0.5.0
## [99] jquerylib_0.1.4 Rcpp_1.0.11
## [101] dbplyr_2.3.3 png_0.1-8
## [103] XML_3.99-0.14 parallel_4.3.1
## [105] blob_1.2.4 prettyunits_1.1.1
## [107] sparseMatrixStats_1.12.2 bitops_1.0-7
## [109] scales_1.2.1 crayon_1.5.2
## [111] rlang_1.1.1 KEGGREST_1.40.0